To accelerate the development of drugs against severe acute
respiratory syndrome (SARS), we constructed a homology model of
the SARS coronavirus main protease using our modeling software,
FAMS Ligand&Complex, and released it before the X-ray structure
was solved. Comparison with the X-ray structure showed that our
model was accurately predicted and is useful for structure based
drug design.
The first case of severe acute respiratory syndrome (SARS) was
identified in China in late 2002, and SARS then spread rapidly to
many countries in early 2003. Scientists in all areas are now
working to develop effective drugs against SARS, because it is
feared that there could be another outbreak either this winter or
the next. The cause of SARS was identified as a novel coronavirus;
its main protease (SARS-CoV Mpro) is the primary target for drugs
because it plays an important role in virus replication. Although
the three-dimensional (3D) structure of SARS-CoV Mpro is needed to
accelerate the discovery of new drugs, it had not been solved
experimentally when the first complete genome sequences of
SARS-CoV were reported on May 1, 2003.1,2) Therefore, we
constructed a homology model of SARS-CoV Mpro using the computer
software FAMS Ligand&Complex [Takeda-Shitaka et al., in
preparation] and released the model structure at
http://www.pd-fams.com/ on May 9, 2003.3) Many scientists started
to carry out structure based drug design using our model without
waiting for the X-ray structure to be solved. After our model was
released, the X-ray structure of SARS-CoV Mpro of the Urbani
strain was released from the Protein Data Bank (PDB)4) (PDB ID:
1Q2W) on July 29, 2003. In this paper, we compare our model with
the X-ray structure of SARS-CoV Mpro to evaluate the usefulness of
the model in structure based drug design.

We constructed the SARS-CoV Mpro model based on the sequence of
the Urbani strain.1) The search for a reference protein and the
sequence alignment were carried out by reverse PSI-BLAST
(RPS-BLAST).5) The X-ray structure of the transmissible
gastroenteritis coronavirus (TGEV) Mpro (PDB ID: 1LVO) was
selected as the reference protein for SARS-CoV Mpro. The sequence
identity between SARS-CoV Mpro and TGEV Mpro was 44.5%, and the
E-value (810 °°) was low enough for TGEV Mpro to be used as the
reference protein. There were three 1- or 2-residue insertions in
SARS-CoV Mpro. Because the reference protein forms a tight dimer
in the crystal, we constructed a homodimer model of SARS-CoV Mpro
based on the alignment using FAMS Ligand&Complex.
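
The sequence identity quoted above can be illustrated with a minimal sketch. It assumes that identity is counted over alignment columns in which neither sequence has a gap; the text does not state which denominator was used, and RPS-BLAST may report identity differently. The function name and the two toy sequences are hypothetical and are not the real Mpro alignment.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    aligned_cols = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # insertion/deletion column: excluded (assumption)
        aligned_cols += 1
        if a == b:
            identical += 1
    if aligned_cols == 0:
        raise ValueError("no aligned residue pairs")
    return 100.0 * identical / aligned_cols


# Toy alignment (hypothetical sequences, one gap in each).
target = "SGFRKMAF-PSGK"
reference = "SGLRKMAQAPSG-"
print(round(percent_identity(target, reference), 1))
```

Excluding gap columns keeps short insertions, such as the three 1- or 2-residue insertions mentioned above, from lowering the reported identity; other conventions divide by the full alignment length or by the shorter sequence length.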

First, we checked the stereochemical quality of our model. No
unfavorable atomic contacts and no unnatural chiral centers were
observed, and there were no bad steric clashes that would prevent
close interaction of the two monomers in the dimer. In the
Ramachandran plot of the main-chain φ–ψ angles produced by the
program PROCHECK,6) almost all of the nonglycine residues were in
the most favored or allowed regions. All the ω angles were
trans-planar.
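
The main-chain angles checked above are ordinary dihedral angles: φ is defined by the atoms C(i-1), N(i), Cα(i), C(i); ψ by N(i), Cα(i), C(i), N(i+1); and ω by Cα(i), C(i), N(i+1), Cα(i+1). The following is a minimal sketch of how such an angle can be computed from four atomic positions. It does not call PROCHECK; the function names and the 30° tolerance used to call an ω angle "trans" are assumptions made for illustration.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four 3D points (0 = cis, 180 = trans)."""
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    b1_len = math.sqrt(_dot(b1, b1))
    y = _dot(_cross(n1, n2), b1) / b1_len
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def is_trans(omega_deg, tol_deg=30.0):
    """True if an omega angle lies within tol_deg of +/-180 degrees (assumed cutoff)."""
    return abs(abs(omega_deg) - 180.0) <= tol_deg


# Toy coordinates (hypothetical, not taken from the model).
p0, p1, p2 = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
print(dihedral(p0, p1, p2, (0.0, 1.0, 1.0)))             # quarter turn
print(is_trans(dihedral(p0, p1, p2, (-1.0, 0.0, 1.0))))  # planar trans arrangement
```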

Second, we superimposed our model on the X-ray structure to check
their similarity. Table 1 and Fig. 1 show that our model is very
similar to the X-ray structure. As we predicted, the X-ray
structure of SARS-CoV Mpro forms a homodimer. Each monomer has
three domains, I, II and III (residues 8–101, 102–184 and 201–301,
respectively). Domains I and II (the active site domain) are
β-barrel domains. The active site region is located in a cleft
between domains I and II. Domain III is an α-helical domain. As
shown in Table 1, the root mean square deviations (rmsds) for each
domain (active site domain and α-helical domain) are small. The
rmsd is smallest in the active site region (0.78 Å for Cα atoms),
which is the most important region for structure based drug
design. This means that the active site region is quite accurately
predicted. The rmsds for all residues (monomer and homodimer) are
higher than those of the active site region, the active site
domain and the α-helical domain, probably because the relative
arrangement of the two monomers in the dimer, and that of the
active site and α-helical domains within each monomer, differ
slightly between the X-ray structure (PDB ID: 1Q2W) and the
reference structure (PDB ID: 1LVO).
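
The rmsd values discussed above follow the usual definition, the square root of the mean squared distance between matched atoms. The sketch below is a minimal illustration that assumes the two coordinate sets have already been superimposed and list the same atoms in the same order; the least-squares fitting step is not shown, and the program actually used for the superposition is not named in the text. The averaging helper mirrors the way several models are compared with one X-ray chain. All names and coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples, already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def mean_rmsd(models, reference):
    """Average of the rmsds between each model and one reference structure."""
    values = [rmsd(model, reference) for model in models]
    return sum(values) / len(values)


# Toy data: a three-atom "reference" and two displaced "models".
reference_xyz = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
model_1 = [(x, y, z + 1.0) for x, y, z in reference_xyz]        # shifted by 1 along z
model_2 = [(x + 3.0, y + 4.0, z) for x, y, z in reference_xyz]  # shifted by (3, 4, 0)
print(rmsd(model_1, reference_xyz), rmsd(model_2, reference_xyz))
print(mean_rmsd([model_1, model_2], reference_xyz))
```

Restricting the input lists to Cα atoms or to a residue subset (for example the active site residues) gives the corresponding per-region values.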

Anand et al. predicted a monomer homology model of SARS-CoV Mpro
using Insight II (Accelrys Inc.) and reported it in the journal
Science online on May 13, 2003.7) Their model was released at PDB
(PDB ID: 1P9T) on May 20, 2003. We superimposed their model on the
X-ray structure and calculated the rmsds (Table 1). The result
shows that the rmsds of our model are smaller than those of their
model, indicating that our model is more accurate than theirs. In
homology modeling, the accuracy of the model generally depends on
the quality of the alignment and on the model construction
procedure. In the case of SARS-CoV Mpro, the accuracy depends
mainly on the latter, since there is very little error in the
alignment because of the high sequence identity between the target
and reference proteins. The good result of our modeling study
probably comes from the ability of our software FAMS
Ligand&Complex to predict structure accurately. FAMS
Ligand&Complex was developed by improving the procedures of
FAMS,8) a fully automatic modeling system for individual proteins.
The high performance of FAMS, and also of FAMS Ligand&Complex,
derives mainly from an optimization protocol based on iterative
cycles of side chain and main chain optimization. This protocol
enables the conservation of side chain torsional angles and main
chain conformation among homologous proteins.

Table 1. Root Mean Square Deviations for Superposition between Models
and the X-Ray Structure of SARS-CoV Mpro

                                          rmsd (Å)
                                          Cα atom                  All atom
Superimposed residues a)                  Our model b)  1P9T       Our model b)  1P9T
Active site region c)                     0.78 d)       1.24 e)    1.42 d)       1.66 e)
Active site domain (domains I and II) f)  1.16 d)       1.99 e)    1.89 d)       2.55 e)
α-Helical domain (domain III) g)          1.68 d)       2.75 e)    2.67 d)       3.60 e)
All residues (monomer)                    2.54 d)       3.48 e)    3.10 d)       3.98 e)
All residues (homodimer)                  2.82 h)       -          3.31 h)       -

a) Residues used in the superposition and rmsd calculations. b) We
constructed six homodimer models because the simulated annealing in
the refinement procedure of FAMS Ligand&Complex gives various
solutions. c) Catalytic and binding site residues 41, 140, 143–145,
161–168, 172 and 187–191. d) Averaged value of twelve rmsds between
the twelve monomers in the six dimer models and the corresponding
chain of the X-ray structure. e) Averaged value of two rmsds between
chain A of 1P9T and chains A and B of 1Q2W. f) Residues 8–184.
g) Residues 201–301. h) Averaged value of six rmsds between the six
dimer models and the one dimer X-ray structure.

Fig. 1. Superposition of Our Model and the X-Ray Structure of SARS-CoV Mpro
Our model is shown in cyan and magenta. The X-ray structure is shown in yellow and orange. (A) Homodimer structure of SARS-CoV Mpro. (B) Active site region of SARS-CoV Mpro.

Third, we carried out normal mode analyses of our model,
Anand’s model and the X-ray structure by the method using dihedral
angles described in earlier papers from our laboratory'°—'*) to
evaluate the dynamic characteristics of the models. Before
carrying out the normal mode analyses, some residues in the X-ray
structure that were disordered and not included in 1Q2W were
modeled using FAMS Ligand&Complex and the CHIMERA modeling
system.'®'” The normal mode analyses were carried out on both
homodimer and monomer structures. As shown in Fig. 2A, the results
for the monomer structures of our model, Anand’s model and the
X-ray structure indicated that the dynamic characteristics of our
model were very similar to those of the X-ray structure, but that
Anand’s model was not as stable as the X-ray structure. Anand’s
model is very unstable in the loop regions and in those regions
where the secondary structures (α-helix and β-sheet) are broken.
In our model, the secondary structures are accurately predicted.
The normal mode analyses for homodimer structures were carried out
only on our model and the X-ray structure because Anand’s model is
a monomer. To examine the internal motion of each molecule, the
Eckart conditions'®) were applied. As shown in Fig. 2B, the
dynamic characteristics of our model are similar to those of the
X-ray structure. These results showing the similarity of our model
to the X-ray structure (Figs. 2A and B) are very significant,
because proteins must be treated as flexible, in consideration of
the induced fit concept, when protein–ligand or protein–protein
docking is carried out.

Fig. 2. Fluctuations of Cα Atoms of the Models and the X-Ray Structure of SARS-CoV Mpro
The red, blue and black lines denote Cα-atom fluctuations of our model, Anand’s model (PDB ID: 1P9T) and the X-ray structure (PDB ID: 1Q2W), respectively. Secondary structure assignment (α-helix or β-sheet) was done by STRIDE.'*) The magenta, pink and orange lines denote the α-helix regions of our model, Anand’s model and the X-ray structure, respectively. The green, cyan and yellow green lines denote the β-sheet regions of our model, Anand’s model and the X-ray structure, respectively. (A) Fluctuations in the monomer structures of our model, Anand’s model and the X-ray structure. For our model and the X-ray structure, the fluctuations were determined by averaging the fluctuation data of six energy-optimized structures of one molecule and those of the other molecule (twelve energy-optimized structures in total). For Anand’s model, the fluctuations were determined by averaging the fluctuation data of twelve energy-optimized structures of a monomer. (B) Fluctuations in the homodimer structures of our model and the X-ray structure. To examine the internal motion of each molecule, the Eckart conditions were applied. The fluctuations were determined by averaging the fluctuation data of six energy-optimized structures of one molecule and those of the other molecule (twelve energy-optimized structures in total).

May 2004
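
To make the notion of normal mode fluctuations concrete, the sketch below computes per-atom mean-square fluctuations in the Cartesian harmonic approximation, where the fluctuation of atom i is k_B T times the sum over non-zero modes of |u_k,i|^2 / λ_k, with λ_k and u_k the eigenvalues and orthonormal eigenvectors of the Hessian of the potential energy. This is a swapped-in, simpler formulation and is not the dihedral-angle method used in the normal mode analyses above; the function name, the zero-mode tolerance and the two-atom toy modes are assumptions.

```python
def mean_square_fluctuations(eigenvalues, eigenvectors, kT, zero_tol=1e-8):
    """Per-atom mean-square fluctuation from Hessian eigenpairs.

    eigenvalues  : Hessian eigenvalues (energy / length**2)
    eigenvectors : orthonormal eigenvectors, each of length 3 * n_atoms
    kT           : thermal energy in the same energy unit as the eigenvalues
    """
    if not eigenvectors:
        raise ValueError("at least one mode is required")
    n_coords = len(eigenvectors[0])
    if n_coords % 3 != 0:
        raise ValueError("eigenvector length must be a multiple of 3")
    n_atoms = n_coords // 3
    msf = [0.0] * n_atoms
    for lam, vec in zip(eigenvalues, eigenvectors):
        if lam <= zero_tol:
            continue  # skip rigid-body (zero-frequency) modes
        for i in range(n_atoms):
            x, y, z = vec[3 * i:3 * i + 3]
            msf[i] += kT * (x * x + y * y + z * z) / lam
    return msf


# Toy system: two atoms moving along x, one translation mode (eigenvalue 0)
# and one stretching mode (eigenvalue 2.0, arbitrary units).
s = 2 ** -0.5
toy_vectors = [[s, 0.0, 0.0, s, 0.0, 0.0], [s, 0.0, 0.0, -s, 0.0, 0.0]]
toy_values = [0.0, 2.0]
print(mean_square_fluctuations(toy_values, toy_vectors, kT=1.0))
```

Soft modes (small λ_k) dominate the sum, which is why loosely packed loops or broken secondary structure elements show up as large peaks in fluctuation profiles.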

In conclusion, our model proved to be accurately predicted.
Moreover, our model is, to our knowledge, the first structure of
SARS-CoV Mpro that was made publicly accessible. The structural
information obtained from our model was certainly much more useful
than sequence information alone before the X-ray structure was
solved. There are many target proteins like SARS-CoV Mpro for
which the sequence is available but the 3D structure has not been
solved; therefore, to accelerate drug discovery, homology modeling
methods that can construct models accurately (and quickly if
possible) are indispensable for structure based drug design.